SUMMARY Dynamic resource management has become an active area
of research in the Cloud Computing paradigm. The cost of resources varies
significantly depending on the configuration in which they are used. Hence,
efficient management of resources is of prime interest to both Cloud
Providers and Cloud Users. In this work we suggest a probabilistic
resource provisioning approach that can be exploited as the input of a
dynamic resource management scheme. Using a Video on Demand use case to
justify our claims, we propose an analytical model, inspired by standard
models of epidemic spreading, to represent sudden and intense workload
variations. We show that the resulting model satisfies a Large Deviation
Principle that statistically characterizes extreme rare events, such as
the ones produced by "buzz/flash crowd effects" that may cause workload
overflow in the VoD context. This analysis provides valuable insight into
the abnormal behaviors that can be expected of such systems. We exploit
the information obtained using the Large Deviation Principle for the
proposed Video on Demand use case to define policies (Service Level
Agreements). We believe these policies for elastic resource provisioning
and usage may be of interest to all stakeholders in the emerging context
of cloud networking.
key words: Cloud Networking, Resource Management, Epidemic Model,
Workload Generator, Large Deviation Principle, Service Level Agreements,
Video on Demand, Buzz/Flash Crowd

1. Introduction 

Users of a Cloud Computing platform have many choices regarding server
selection (some servers are compute intensive, some provide better I/O
performance, some are superior in networking). A Cloud provider such as
Amazon offers many different server instances that differ in CPU speed,
network bandwidth and memory capacity. Each of these instances provides a
certain amount of dedicated resource and is charged per instance-hour
consumed [1]. A Service Provider finds it extremely difficult to determine
the best combination of servers to deploy in a Cloud for a given
application. This problem differs from traditional distributed computing
(like Grid), since the number of servers is virtually unlimited but
bandwidth is limited. The deployment of resources can be tuned
dynamically using cloud virtualization, which abstracts the IT resources
to allow on-line communication and control. The cost of resources varies
significantly depending on server types and Cloud Service Providers.

In most applications, the amount of IT resource that is actually used is
a highly variable quantity that follows the instantaneous activity, and
in particular the volume of exchanged traffic when network
infrastructures are concerned. Depending on the type of application, the
generated workload can be a highly varying process, which makes it
difficult to find an acceptable trade-off between an expensive
over-provisioning able to anticipate peak loads and an under-performing
resource allocation that does not mobilize enough resources. To address
this challenge, dynamic bandwidth allocation is an original approach that
we chose to investigate in the context of network virtualization. We aim
to demonstrate a proof of concept for the case of a Video on Demand
(VoD) system by adaptively tuning the provisioned bandwidth to the
current application workload. In this paper we resort to probabilistic
resource provisioning; in some situations it can be used to anticipate
resource requirements that then serve as inputs for dynamic resource
allocation.

Our work attempts to capture some properties that describe the user
behaviors, or the workload generating mechanism of the system, and fits
them to a mathematical model satisfying particular properties. We
leverage these properties to derive a probabilistic description of the
mean workload of the system at different time resolutions. Embedding the
notion of time scale is very important, since time scale is intrinsic to
dynamicity. In this study we build our system using epidemic models, for
which Markovian models are widely used and happen to satisfy the
specific property mentioned above.

Epidemic information dissemination has been an active area of research
in distributed systems, such as Peer-to-Peer (P2P) or VoD systems. In
[2], it has already been demonstrated that epidemic algorithms can be
used as an effective solution for information dissemination in P2P
systems deployed on the Internet or on ad-hoc networks. The authors of
[3] studied random epidemic strategies, like the random peer, latest
useful chunk algorithm, to achieve optimal information dissemination.
However, the work most relevant to our study is presented in [4], where
the authors proposed an approach to predict workload for cloud clients.
They used an auto-scaling algorithm for resource provisioning and
validated the result with real-world Cloud client application traces.
Our approach encompasses both a constructive Markovian model to
reproduce epidemic information dissemination and workload provisioning
aspects. However, we insist on the fact that its originality stems from
the analysis of the Large Deviation property of the proposed Markovian
model. The resulting characterization can be viewed as a
multi-resolution extension of the classical steady-state distribution
for the observable mean value of the random process over different
aggregated time scales.
After constructing the Markovian mathematical model, we propose two
possible and generic ways to exploit this information in the context of
probabilistic resource provisioning. They can serve as the input of the
resource management functionalities of the Cloud environment. Elasticity
cannot be defined without the notion of a time scale; the Large
Deviation Principle (LDP) automatically integrates the time resolution
into the description of the system. It is to be noted that Markovian
processes do satisfy the LDP, but so do some other models as well.
Hence, our proposed probabilistic approach is very generic and can adapt
to address any provisioning issue, provided the resource volatility can
be reliably represented by a stochastic process for which the LDP holds
true.

The rest of the paper is organized as follows. In Section 2 we discuss
the VoD system as our use case, followed by a Markovian description of
the model in Section 3. Section 4 presents the Large Deviation
Principle. We discuss the numerical interpretations in Section 5.
Section 6 deals with the probabilistic provisioning scheme derived from
the Large Deviation Spectrum for our use case, followed by the
conclusion in Section 7.

2. Use Case: Video on Demand (VoD) 

A VoD service delivers video contents to consumers on request. According
to Internet usage trends, users are increasingly involved in VoD and
this enthusiasm is likely to grow. A popular VoD provider like Netflix
accounts for around 30 percent of the peak downstream traffic in North
America and is the "largest source of Internet traffic overall" [5]. In
a VoD system, consumers are video clients who are connected to a Network
Provider. The source video content is managed and distributed by a
Service Provider from a central data centre. With the evolution of Cloud
Computing and Networking, the service in a VoD system can be made more
scalable by dynamically distributing the caching/transcoding servers
across the network providers. Video service providers interact with the
network service providers and describe the virtual infrastructures
required to implement the service (like the number of servers required,
their placements and the clustering of resources). The resource provider
reserves resources for a certain time period and may change the
reservation dynamically depending on resource requirements. Such a
dynamic approach brings cost savings through dynamic resource
provisioning, which is important for service providers as VoD workload
is highly variable by nature. However, since the virtual resources used
by Cloud Networking have a set-up time which is not negligible, analysis
and provisioning of such a system can be very critical from the
operator's perspective (capex versus opex trade-off). Figure 1 shows a
VoD schematic where the back-end server is connected to the data centre
and the transcoding (caching) servers are placed across the network
providers.



Fig. 1 Basic schematics of a VoD system with transcoding/caching
servers. Legend: VoD database and back-end server; T: transcoding
servers; C: video clients.

Since VoD has stringent streaming rate requirements, 
each VoD provider needs to reserve a sufficient amount 
of server outgoing bandwidth to sustain continuous media 
delivery. When multiple VoD providers (such as Netflix) 
are on board to use cloud services from cloud providers, 
there will be a market between VoD providers and cloud 
providers, and commodities to be traded in such a market 
consist of bandwidth reservations, so that VoD streaming 
performance can be guaranteed. 

As a buyer in such a market, each VoD provider can periodically make
reservations for bandwidth capacity to satisfy its random future demand.
A simple way to achieve this is to estimate the expectation and variance
of its future demand using historical demand information, which can
easily be obtained from cloud monitoring services. As an example, Amazon
CloudWatch provides a free resource monitoring service to Amazon Web
Service customers for a given frequency. Based on such estimates of
future demand, each VoD provider can individually reserve a sufficient
amount of bandwidth to satisfy, on average, its random future demand
within a reasonable confidence. However, this information is not helpful
in case of a "buzz" or a "flash crowd", when a video becomes popular
very quickly, leading to a flood of user requests on the VoD servers.
One example of "buzz" is the video "Star Wars Kid" [6], for which
interest grew very quickly within a very short timespan. According to
[7] it was viewed more than 900 million times within a short interval of
time, making it one of the top viral videos. Figure 2 plots the original
server logs for the Star Wars Kid debacle [6].

Fig. 2 Video server workload: time series displaying a characteristic
pattern of flash crowd (buzz effect). Panel title: "Video Downloads
(April 29 - July 29)". Trace obtained from URL:
http://waxy.org/2008/05/star_wars_kid_the_data_dump/

In situations like the one described in Figure 2, variance estimation,
or more generally the steady-state distribution, cannot explain the
burstiness of such an event, as time resolution is excluded from the
description. The LDP, by virtue of its multi-resolution extension of the
classical steady-state distribution, can describe the dynamics of rare
events like this, which we believe can be of interest to VoD service
providers.

3. Markov Model to describe the behavior of the users 

Epidemic models commonly subdivide a population into several
compartments: susceptible (noted S) to designate the persons who can get
infected, and contagious (noted C) for the persons who have contracted
the disease. This contagious class can further be categorized into two
parts: the infected subclass (I), corresponding to the persons who are
currently suffering from the disease and can spread it, and the recovered
class (R) for those who got cured and do not spread the disease anymore
[8]. There can be more categories, but they fall outside the scope of our
current work. In these models $(N_S(t))_{t \geq 0}$, $(N_I(t))_{t \geq 0}$
and $(N_R(t))_{t \geq 0}$ are stochastic processes representing the time
evolution of the susceptible, infected and recovered populations
respectively. Similarly, information dissemination in a social network
can be viewed as an epidemic spreading (through gossip), where the
"buzz" is a special event in which interest in some particular
information increases drastically within a very short period of time.
Following the lines of related works, we claim that the above mentioned
epidemic models can appropriately be adapted to represent the way
information spreads among the users of a VoD system. In the case of a
VoD system, the infected class I refers to the people who are currently
watching the video and can spread the information about it. In our
setting, I directly represents the current workload, that is, the
current aggregated video requests from the users. Here, we consider the
workload as the total number of current viewers, but it can also refer
to the total bandwidth requested at the moment. The class R refers to
the past viewers. In contrast to the classical epidemic case, we
introduce a memory effect in our model, assuming that the R compartment
can still propagate the gossip during a certain random latency period.
Then, we define the probability, within a small time interval $dt$, for
a susceptible individual to turn into an active viewer, as follows:



$$ \mathbb{P}_{S \to I} = \left(l + (N_I(t) + N_R(t))\,\beta\right) dt + o(dt) \qquad (1) $$

where $\beta > 0$ is the rate of information dissemination per unit
time and $l > 0$ fixes the rate of spontaneous viewers. The
instantaneous rate of newly active viewers in the system at time $t$ is
thus:

$$ \lambda(t) = l + (N_I(t) + N_R(t))\,\beta. \qquad (2) $$

Equation (2) corresponds to the arrival rate $\lambda(t)$ of a
non-homogeneous (state-dependent) Poisson process. This rate varies
linearly with $N_I(t)$ and $N_R(t)$.

To complete our model we assume that the watch time of a video is
exponentially distributed with rate $\gamma$. As already mentioned, it
also seems reasonable to consider that a past viewer will not keep
propagating the gossip about a video indefinitely, but remains active
only for a random latency period that we also assume exponentially
distributed with rate $\mu$ (in general $\mu \ll \gamma$). Another
important consideration of the model is the maximum allowable number of
viewers ($I_{max}$) at any instant of time. This assumption conforms to
the fact that the resources in the system are physically limited. For
the sake of numerical tractability and without loss of generality, we
also assume the number of past (but still rumour-spreading) viewers at a
given instant to be bounded by a maximum value ($R_{max}$). With these
assumptions, and denoting by $(N_I(t) = i, N_R(t) = r)$ the current
state of the Markov process, the probability that the process reaches a
different state $(i' \leq I_{max}, r' \leq R_{max})$ at time $t + dt$
($dt$ being small) reads:

$$
\mathbb{P}(i', r' \mid i, r) =
\begin{cases}
(l + (i + r)\beta)\,dt + o(dt) & \text{for } (i' = i + 1,\ r' = r), \\
(\gamma i)\,dt + o(dt) & \text{for } (r' = r + 1,\ i' = i - 1), \\
(\mu r)\,dt + o(dt) & \text{for } (r' = r - 1,\ i' = i), \\
o(dt) & \text{otherwise.}
\end{cases}
\qquad (3)
$$

This process, which defines the evolution of the current viewer and past
viewer populations, is a finite and irreducible Markov chain. It is to
be noted that $l > 0$ prevents the process from reaching an absorbing
state. This chain is ergodic and admits a stationary regime.

The description above defines the mechanism of information
dissemination in the community in normal situations. A buzz event
differs from this situation by a sudden increase of the dissemination
rate $\beta$. In order to adapt the model to buzz we resort to a Hidden
Markov Model (HMM) to be able to reproduce the change in $\beta$.
Without loss of generality we consider only two states. One, with
dissemination rate $\beta = \beta_1$, corresponds to the buzz-free case
described above, and another hidden state corresponds to the buzz
situation, where the value of $\beta$ increases significantly and takes
on a value $\beta_2 \gg \beta_1$. Transitions between these two hidden
and memoryless Markov states occur with rates $a_1$ and $a_2$
respectively (see Figure 3). These rates characterize the buzz in terms
of frequency, magnitude and duration.



Fig. 3 Markov chain diagram representing the evolution of the
current viewers ($i$) and past viewers ($r$) populations with a Hidden
Markov Model.
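The model of relation (3), extended with the two hidden dissemination states, can be simulated with a standard Gillespie (stochastic simulation) scheme. The sketch below is only an illustration and not the simulator described in Section 5: the function names are hypothetical, the rule that disables moves leaving the box $[0, I_{max}] \times [0, R_{max}]$ is an assumption, and so is the reading of $a_1$ as the rate of leaving the buzz-free state and $a_2$ as the rate of leaving the buzz state. The parameter values in the usage line are those quoted in the caption of Fig. 4, with both bounds set to 60 as an arbitrary choice.

```python
import random

def simulate_buzz(beta1, beta2, a1, a2, gamma, mu, l, i_max, r_max, t_end, seed=0):
    """Gillespie simulation of (i, r) with a hidden two-state dissemination rate.

    Returns the list of (time, i, r, buzz) recorded at every jump.
    """
    rng = random.Random(seed)
    t, i, r, buzz = 0.0, 0, 0, False
    path = [(t, i, r, buzz)]
    while True:
        beta = beta2 if buzz else beta1
        # Rates of the four possible moves; moves leaving the bounded box
        # [0, i_max] x [0, r_max] are disabled (boundary rule assumed here).
        rates = {
            "arrive": l + (i + r) * beta if i < i_max else 0.0,  # i -> i + 1
            "finish": gamma * i if r < r_max else 0.0,           # i -> i - 1, r -> r + 1
            "forget": mu * r,                                    # r -> r - 1
            "switch": a2 if buzz else a1,                        # hidden state flips
        }
        t += rng.expovariate(sum(rates.values()))  # exponential holding time
        if t > t_end:
            return path
        move = rng.choices(list(rates), weights=list(rates.values()))[0]
        if move == "arrive":
            i += 1
        elif move == "finish":
            i, r = i - 1, r + 1
        elif move == "forget":
            r -= 1
        else:
            buzz = not buzz
        path.append((t, i, r, buzz))

def time_average_viewers(path, t_end):
    """Time-averaged number of current viewers over [0, t_end]."""
    area = 0.0
    closing = [(t_end, 0, 0, False)]  # sentinel closing the last holding interval
    for (t0, i, _, _), (t1, _, _, _) in zip(path, path[1:] + closing):
        area += i * (t1 - t0)
    return area / t_end

path = simulate_buzz(0.1, 0.8, 0.006, 0.6, 0.7, 0.3, 1.0, 60, 60, t_end=2000.0, seed=1)
print(round(time_average_viewers(path, 2000.0), 2))
```

Setting `beta2 = beta1` reduces the sketch to the buzz-free chain, which makes it convenient for comparing the two regimes on equal terms.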



4. Large Deviation Principle 

Consider a continuous-time Markov process $(X_t)_{t \geq 0}$, taking
values in a finite state space $S$, of rate matrix
$A = (A_{ij})_{i \in S, j \in S}$. In our case $X$ is a vectorial
process $X(t) = (N_I(t), N_R(t)), \forall t \geq 0$, and
$S = \{0, \dots, I_{max}\} \times \{0, \dots, R_{max}\}$. If the rate
matrix $A$ is irreducible, then the process $X$ admits a unique
steady-state distribution $\pi$ satisfying $\pi A = 0$. Moreover, by
Birkhoff's ergodic theorem, it is known that for any mapping
$\Phi : S \to \mathbb{R}$, the sample mean of $\Phi(X)$ at scale $\tau$,
i.e. $\frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds$, converges almost surely
towards the mean of $\Phi(X)$ under the steady-state distribution, as
$\tau$ tends to infinity. The function $\Phi$ is often called the
observable. In our case, as we are interested in the variations of the
current number of users $N_I(t)$, $\Phi$ will simply be the function
that selects the first component: $\Phi(N_I(t), N_R(t)) = N_I(t)$. The
large deviations principle (LDP), which holds for irreducible Markov
processes on a finite state space [9], gives an efficient way to
estimate the probability for the sample mean calculated over a large
period of time $\tau$ to be around a value $\alpha \in \mathbb{R}$ that
deviates from the almost-sure mean:

$$ \lim_{\epsilon \to 0} \lim_{\tau \to \infty} \frac{1}{\tau} \log \mathbb{P}\left\{ \frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds \in [\alpha - \epsilon, \alpha + \epsilon] \right\} = f(\alpha). \qquad (4) $$

The mapping $\alpha \mapsto f(\alpha)$ is called the large deviations
spectrum (or the rate function). For a given function $\Phi$, it is
possible to compute the theoretical large deviations spectrum from the
rate matrix $A$ as follows. One first computes, for each value of
$q \in \mathbb{R}$, the quantity $\Lambda(q)$ defined as the principal
eigenvalue (i.e., the largest) of the matrix with elements
$A_{ij} + q\,\delta_{ij}\,\Phi(i)$ ($\delta_{ij} = 1$ if $i = j$ and 0
otherwise). Then the large deviations spectrum can be computed as the
Legendre transform of $\Lambda$:

$$ f(\alpha) = \sup_{q \in \mathbb{R}} \{ q\alpha - \Lambda(q) \}, \quad \forall \alpha \in \mathbb{R}. \qquad (5) $$

As described in Equation (4), $\alpha_\tau = \langle i \rangle_\tau$
corresponds, in our case study, to the mean number of users $i$
observable over a period of time of length $\tau$, and $f(\alpha)$
relates to the probability of its occurrence as follows:

$$ \mathbb{P}\left[ \langle i \rangle_\tau \approx \alpha \right] \sim e^{\tau \cdot f(\alpha)}. \qquad (6) $$
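The two-step recipe above (principal eigenvalue, then Legendre transform) can be sketched numerically for the buzz-free chain of relation (3). This is a minimal illustration under stated assumptions, not the computation used for the figures: the helper names are hypothetical, the boundary rule of the truncated chain is assumed, the principal eigenvalue is obtained by power iteration on a shifted (non-negative) matrix instead of a full eigen-decomposition, and the supremum in (5) is taken over a finite grid of $q$. The code evaluates $\sup_q \{q\alpha - \Lambda(q)\}$ exactly as written in (5), which is non-negative; under the convention of relation (6), where probabilities scale as $e^{\tau f(\alpha)}$, the exponent would be read as its negative.

```python
def build_chain(beta, gamma, mu, l, i_max, r_max):
    """Outgoing transitions {state: [(next_state, rate), ...]} of relation (3)."""
    chain = {}
    for i in range(i_max + 1):
        for r in range(r_max + 1):
            moves = []
            if i < i_max:
                moves.append(((i + 1, r), l + (i + r) * beta))  # new viewer
            if i > 0 and r < r_max:
                moves.append(((i - 1, r + 1), gamma * i))       # viewer finishes
            if r > 0:
                moves.append(((i, r - 1), mu * r))              # past viewer forgets
            chain[(i, r)] = moves
    return chain

def principal_eigenvalue(chain, q, tol=1e-12, max_iter=100000):
    """Lambda(q): largest eigenvalue of A + q*diag(Phi), with Phi(i, r) = i."""
    diag = {s: q * s[0] - sum(rate for _, rate in moves)
            for s, moves in chain.items()}
    # Shifting by a constant makes every entry non-negative, so that power
    # iteration converges to the Perron (principal) eigenvalue.
    shift = max(abs(d) for d in diag.values()) + 1.0
    v = {s: 1.0 for s in chain}
    estimate = 0.0
    for _ in range(max_iter):
        w = {s: (diag[s] + shift) * v[s] + sum(rate * v[t] for t, rate in moves)
             for s, moves in chain.items()}
        new_estimate = max(w.values())
        v = {s: x / new_estimate for s, x in w.items()}
        if abs(new_estimate - estimate) < tol:
            break
        estimate = new_estimate
    return new_estimate - shift

def legendre_spectrum(chain, alphas, q_grid):
    """sup over the q grid of q*alpha - Lambda(q), for each alpha."""
    lam = [principal_eigenvalue(chain, q) for q in q_grid]
    return [max(q * a - L for q, L in zip(q_grid, lam)) for a in alphas]

chain = build_chain(beta=0.1, gamma=0.7, mu=0.3, l=1.0, i_max=5, r_max=5)
q_grid = [k / 10 for k in range(-20, 21)]
for alpha, value in zip([1.0, 2.0, 4.0], legendre_spectrum(chain, [1.0, 2.0, 4.0], q_grid)):
    print(alpha, round(value, 4))
```

A small state space is used here only to keep the pure-Python iteration fast; a sparse eigen-solver would be the natural choice for realistic values of $I_{max}$ and $R_{max}$.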

Interestingly also, if the process is strictly stationary (i.e. the
initial distribution is invariant), the same large deviation spectrum
$f(\cdot)$ can be estimated from a single trace, provided that it is
"long enough" [10]. We proceed as follows: at a scale $\tau$, the trace
is chopped into $k_\tau$ intervals
$\{I_{j,\tau} = [(j - 1)\tau, j\tau[,\ j = 1, \dots, k_\tau\}$ of length
$\tau$ and we have (almost surely), for all $\alpha \in \mathbb{R}$:

$$ f_\tau(\alpha, \epsilon_\tau) = \frac{1}{\tau} \log \frac{\#\left\{ j : \frac{1}{\tau} \int_{I_{j,\tau}} \Phi(X_s)\,ds \in [\alpha - \epsilon_\tau, \alpha + \epsilon_\tau] \right\}}{k_\tau} \qquad (7) $$

and $\lim_{\tau \to \infty} f_\tau(\alpha, \epsilon_\tau) = f(\alpha)$.

In practice, for the empirical estimation of the large deviations
spectrum, we use an estimator similar to the one derived in [11] and
also used in [12]. At scale $\tau$, we compute for each
$q \in \mathbb{R}$ the values of $\Lambda'_\tau(q)$ and
$\Lambda''_\tau(q)$, where

$$ \Lambda_\tau(q) = \tau^{-1} \log \left( k_\tau^{-1} \sum_{j=1}^{k_\tau} \exp\left( q \int_{I_{j,\tau}} \Phi(X_s)\,ds \right) \right). $$

Then, for each value of $q$, we count the number of intervals
$I_{j,\tau}$ verifying the condition in expression (7) and estimate the
scale-dependent empirical log-pdf $f_\tau(\alpha, \epsilon_\tau)$, with
the adaptive choices derived in [11]:

$$ \alpha_\tau = \Lambda'_\tau(q) \quad \text{and} \quad \epsilon_\tau = \sqrt{\frac{\Lambda''_\tau(q)}{\tau}}. \qquad (8) $$

Let us now illustrate the LDP in the context of the specific VoD use
case, where $X$ corresponds to $(i, r)$, the bi-variate Markov process.
$\Phi(X)$ is $i$, the observable, and
$\frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds = \langle i \rangle_\tau$
corresponds to the average number of users within a period $\tau$.
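The empirical procedure can be sketched as follows for a uniformly sampled workload trace. Several points are assumptions made for the illustration: the trace is a list of samples taken at unit time steps (so the integral over an interval becomes a sum), the function name is hypothetical, and the tolerance is taken as the standard deviation of the block means under the exponentially tilted weights, which is what $\sqrt{\Lambda''_\tau(q)/\tau}$ evaluates to for the $\Lambda_\tau$ defined in the text. The derivatives are computed in closed form from the tilted weights instead of by numerical differentiation.

```python
import math

def empirical_ld_point(trace, tau, q):
    """One point (alpha, eps, f) of the empirical spectrum at scale tau.

    trace : workload samples at unit time steps (assumed sampling)
    tau   : aggregation scale, in number of samples
    q     : tilting parameter selecting which alpha is probed
    """
    k = len(trace) // tau
    means = [sum(trace[j * tau:(j + 1) * tau]) / tau for j in range(k)]
    # Tilted weights exp(q * tau * mean_j), stabilised by subtracting the max.
    exponents = [q * tau * m for m in means]
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    z = sum(weights)
    # alpha = Lambda'_tau(q) is the tilted mean of the block means.
    alpha = sum(w * m for w, m in zip(weights, means)) / z
    # Lambda''_tau(q) / tau is the tilted variance of the block means.
    variance = sum(w * (m - alpha) ** 2 for w, m in zip(weights, means)) / z
    eps = math.sqrt(variance)
    count = sum(1 for m in means if alpha - eps <= m <= alpha + eps)
    if count == 0:  # guards against rounding at the interval edges
        return alpha, eps, float("-inf")
    return alpha, eps, math.log(count / k) / tau

# Deterministic toy trace, used only to exercise the function.
toy_trace = [(7 * n) % 11 for n in range(2000)]
for q in (-0.5, 0.0, 0.5):
    print(q, empirical_ld_point(toy_trace, tau=10, q=q))
```

Sweeping `q` over a range of values traces out the empirical spectrum at the chosen scale; repeating the sweep for several values of `tau` gives the family of curves that Section 5 compares with the theoretical spectrum.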

5. Numerical Interpretations 

We simulate the proposed workload model and generate two time series
corresponding to the buzz and to the buzz-free situations. We developed
our simulator in the C programming environment, by creating several
parallel child processes (clients) that communicate with a parent
process (server) to disseminate information. Each child process is in
one of the susceptible, active viewer or past viewer states at a
particular instant of time. When it is in the past viewer state, it
randomly chooses another process (using its process id) and communicates
with the parent to infect it. The parent process maintains a table with
the status (which state a process is in) of each process. If the chosen
process is not already in the active viewer or past viewer state, it
gets infected. We have chosen UDP socket-pairs to facilitate
communication between the processes. For fair and consistent
comparisons, we carefully tuned the values of the model parameters so as
to obtain the same mean workload for both resulting traces. In Figure
4(a) the bursty transients represent the buzz effect. They reflect
sudden and sharp increases of workload due to intense dissemination of
popular videos. The zoomed-in view displayed in Figure 4(b) shows the
characteristic pattern of a buzzy transient, that is to say a sharp
increase ($\beta_1 \to \beta_2$) and a slow decrease (owing to
$\beta_2 \to \beta_1$ and to the memory effect of the model).

This clear evidence of our model's ability to capture the buzz effect
is moreover confirmed by the numerical steady-state distributions
$P(i)$ displayed in Figure 5. As compared to the buzz-free case, the
buzz distribution presents a thicker tail, indicating that the
instantaneous workload $i$ takes on larger values with higher
probability. To include the notion of time scale in the results, one
needs to consider, along with the steady-state distribution, the time
coherence of the underlying process, viz. its covariance structure.
However, except for the trivial case of uncorrelated processes, deriving
the statistics of the local average process at any resolution is a hard
problem in general.

The Large Deviation Principle naturally embeds this time scale notion
into the statistical description of the aggregated observable at
different time resolutions. As expected, the theoretical LD spectra
displayed in Figure 6(a) reach their maximum for the same mean number of
users. This apex is the almost sure value described in Section 4. As the
name suggests, the almost sure workload ($\alpha_{a.s.}$) corresponds to
the mean value that we almost surely observe on the trace. More
interestingly though, the LD spectrum corresponding to the buzz case
spans a much larger interval of observable mean workloads than that of
the buzz-free case. This remarkable widening of the support of the
theoretical spectrum shows that the LDP can accurately quantify the
occurrence of extreme, yet rare, events.

Fig. 4 Plot (a): Workload $N_I(t)$ generated according to the model
depicted in Figure 3 (for the buzz case: $\beta_1 = 0.1$,
$\beta_2 = 0.8$, $\gamma = 0.7$, $\mu = 0.3$, $l = 1.0$, $a_1 = 0.006$
and $a_2 = 0.6$; for the buzz-free case: $\beta_1 = \beta_2 = \beta = 0.1$,
$\gamma = 0.7$, $\mu = 0.3$, $l = 1.0$). In both cases, [...] = 60.
Plot (b): Zoomed-in view of a buzz event.

Plots (b)-(c) of Figure 6 compare theoretical and empirical large
deviation spectra obtained for the two traces. For each given scale
($\tau$) the empirical estimation procedure yields one LD estimate.
These empirical estimates at different scales superimpose for a given
range of $\alpha$. This is reminiscent of the scale-invariance property
underlying the large deviation principle. If we focus on the supports of
the different estimated spectra, the larger the time scale $\tau$ is,
the smaller becomes the interval of observable values of $\alpha$. This
is coherent with the fact that, for a finite trace length, the
probability of observing a number of current viewers that, on average,
deviates from the nominal value ($\alpha_{a.s.}$) during a period of
time ($\tau$) decreases exponentially fast with $\tau$. To fix ideas,
the estimates of plot (c) indicate that for a time scale $\tau = 400$
sec, the maximum observable mean number of users is around 5 with
probability $e^{400 \cdot (-0.02)} \approx 35 \cdot 10^{-5}$ (point A),
while it increases up to 9 with the same probability
($e^{100 \cdot (-0.08)}$) for $\tau = 100$ sec (point B).
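As a quick check of the figures quoted for points A and B, relation (6) can be applied directly, taking the spectrum values $f \approx -0.02$ and $f \approx -0.08$ as the ones read off plot (c):

$$ \mathbb{P}_A \approx e^{400 \times (-0.02)} = e^{-8}, \qquad \mathbb{P}_B \approx e^{100 \times (-0.08)} = e^{-8}, \qquad e^{-8} \approx 3.35 \times 10^{-4}. $$

Both points therefore carry the same probability, of the order of $35 \cdot 10^{-5}$. Dividing the time scale by four allows the spectrum value to be four times larger in magnitude for the same probability, which is why a larger mean workload (9 instead of 5) becomes observable at the shorter scale.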

Fig. 5 Steady-state probabilities for the number of current viewers
with buzz and buzz-free scenarios (x-axis: $i$, number of current
viewers; y-axis in log-scale).

Fig. 6 Large Deviations spectra corresponding to the traces of Figure 4.
(a) Theoretical spectra for the buzz-free (blue) and for the buzz (red)
scenarios. (b) & (c) Empirical estimations of $f(\alpha)$ at different
scales from the buzz-free and the buzz traces, respectively.

6. Probabilistic Provisioning

Returning to our VoD use case, we now sketch two possible schemes for
exploiting the Large Deviation description of the system to dynamically
provision the allocated resources:

• Identification of the reactive time scale for reconfiguration: Find a
relevant time scale that realizes a good trade-off between the
expected level of overflow associated with this scale and a
sustainable opex cost induced by the resource reconfiguration needed
to cope with the corresponding flash crowd.

• Link capacity dimensioning: Considering a maximum admissible loss
probability, find the safety margin that must be provisioned on the
link capacity to guarantee the corresponding QoS.

6.1 Identification of the reactive time scale for reconfiguration

We consider the case of a VoD service provider who wants to determine
the reactivity scale at which it needs to reconfigure its resource
allocation. This quantity should clearly derive from a good compromise
between the level of congestion (or losses) it is ready to undergo,
i.e. a tolerable performance degradation, and the price it is willing to
pay for a frequent reconfiguration of its infrastructure. Let us then
assume that the VoD provider has fixed admissible bounds for these two
competing factors, having determined the following quantities:

• $\alpha^* > \alpha_{a.s.}$: the deviation threshold beyond which it
becomes worthwhile (or mandatory) to reconfigure the resource
allocation. This choice is uniquely determined by a capex performance
concern.
• $\sigma^*$: an acceptable probability of occurrence of these
overflows. This choice is essentially guided by the corresponding
opex cost.

Let us moreover suppose that the LD spectrum $f(\alpha)$ of the
workload process was previously estimated, either by identifying the
parameters of the Markov model used to describe the application, or
empirically from collected traces. Then, recalling the probabilistic
interpretation we surmised in relation (6), the minimum reconfiguration
time scale $\tau^*$ for dynamic resource allocation that achieves the
sought compromise is simply the solution of the following inequality:

$$ \int_{\alpha^*}^{\infty} e^{\tau f_\tau(\alpha)}\,d\alpha \geq \sigma^*, \qquad (9) $$

with $f_\tau(\alpha)$ as defined in expression (7).

From a more general perspective though, we can see this problem as an
underdetermined system involving 3 unknowns ($\alpha^*$, $\tau^*$ and
$\sigma^*$) and only one relation (9). Therefore, depending on the
sought objectives, we can fix any two of these variables and determine
the third so that it satisfies the same inequality (9).
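Relation (9) can be explored numerically once a spectrum is available. The sketch below is a hedged illustration only: the quadratic spectrum is a hypothetical stand-in for an estimated $f_\tau(\alpha)$ (and is taken independent of the scale), the integral is truncated at a finite `alpha_max` and evaluated with the trapezoidal rule, and the function names are invented for the example. It scans a grid of time scales and returns the largest one for which the overflow probability still reaches $\sigma^*$; choosing the largest admissible scale is our reading of the compromise, in line with the "largest reconfiguration time scale" mentioned in the conclusion.

```python
import math

def overflow_probability(f, alpha_star, alpha_max, tau, steps=2000):
    """Trapezoidal estimate of the integral of exp(tau * f(alpha)) on [alpha_star, alpha_max]."""
    h = (alpha_max - alpha_star) / steps
    total = 0.5 * (math.exp(tau * f(alpha_star)) + math.exp(tau * f(alpha_max)))
    for n in range(1, steps):
        total += math.exp(tau * f(alpha_star + n * h))
    return total * h

def reconfiguration_scale(f, alpha_star, alpha_max, sigma_star, taus):
    """Largest tau in `taus` whose overflow probability is at least sigma_star."""
    admissible = [tau for tau in taus
                  if overflow_probability(f, alpha_star, alpha_max, tau) >= sigma_star]
    return max(admissible) if admissible else None

def toy_spectrum(alpha):
    """Hypothetical spectrum: zero at the nominal load 3.0, negative elsewhere."""
    return -((alpha - 3.0) ** 2) / 20.0

tau_star = reconfiguration_scale(toy_spectrum, alpha_star=5.0, alpha_max=30.0,
                                 sigma_star=1e-3, taus=range(1, 1001))
print(tau_star)
```

The same two functions also cover the other uses of relation (9): fixing $\tau$ and $\sigma^*$ and scanning `alpha_star`, or fixing $\tau$ and $\alpha^*$ and reading off the probability directly.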

6.2 Link capacity dimensioning 

We now consider an architecture dimensioning problem from the
infrastructure provider's perspective. Let us assume that the
infrastructure and the service providers have come to a Service Level
Agreement (SLA) which, among other things, fixes a tolerable level of
losses due to link congestion. We start by considering the case of a
single VoD server and address the following question: what is the
minimum link capacity $C$ that has to be provisioned such that we meet
the negotiated QoS in terms of loss probability? As in the previous
case, we assume that the estimated LD spectrum $f(\alpha)$
characterizing the application has been identified beforehand. A
rudimentary SLA would be to guarantee a loss-free transmission for the
normal traffic load only: this loose QoS would simply amount to fixing
$C$ to the almost sure workload $\alpha_{a.s.}$. Naturally then, any
load overflow beyond this value will result in goodput limitation (or
losses, if there is no buffer to smooth out exceeding loads). For a more
demanding QoS, we are led to determine the necessary safety margin
$C_0 > 0$ one has to provision above $\alpha_{a.s.}$ to absorb the exact
amount of overruns corresponding to the loss probability $p_{loss}$ that
was negotiated in the SLA. From the interpretation of the large
deviation spectrum provided in Section 4, this margin $C_0$ is
determined by solving the following inequality:



$$
\int_{\alpha_{a.s.} + C_0}^{\infty} \int_{\tau_{min}}^{\tau_{max}} e^{\tau f(\alpha)}\,d\tau\,d\alpha \leq p_{loss}
\iff
\int_{\alpha_{a.s.} + C_0}^{\infty} \frac{e^{\tau_{max} f(\alpha)} - e^{\tau_{min} f(\alpha)}}{f(\alpha)}\,d\alpha \leq p_{loss}. \qquad (10)
$$

In this expression, $\tau_{min}$ is typically determined by the size
$Q$ of the buffers that are usually provisioned to dampen the traffic
volatility. In that case,

$$ \tau_{min} = \frac{Q}{\alpha - (\alpha_{a.s.} + C_0)} \qquad (11) $$

corresponds to the maximum burst duration that can be buffered without
causing any loss at rate $\alpha > C = \alpha_{a.s.} + C_0$. As for
$\tau_{max}$, it relates to the maximum period of reservation dedicated
to the application. Most often though, the characteristic time scale of
the application exceeds the dynamic scale of flash crowds by several
orders of magnitude, and $\tau_{max}$ can then simply be set to
infinity. With these particular integration bounds, Equation (10)
simplifies to

$$ C_0 = C - \alpha_{a.s.} : \quad \int_{C}^{\infty} \frac{-1}{f(\alpha)}\, e^{\frac{Q}{\alpha - C} f(\alpha)}\,d\alpha \leq p_{loss}. \qquad (12) $$

The left-hand side of (12) is a decreasing function of $C$, so the
inequality can be solved using a simple bisection technique.

As long as the server workload remains below $C$, this resource
dimensioning guarantees that no loss occurs. Any overrun above this
value will produce losses, but we ensure that the frequency
(probability) and duration of these overruns are such that the loss rate
remains compliant with the SLA. The proposed approach clearly contrasts
with resource over-provisioning, which does not seek to optimize the
capex while complying with the loss probability tolerated in the SLA.

The same provisioning scheme can straightforwardly be generalized to the
case of several applications sharing a common set of resources. To fix
ideas, let us consider an infrastructure provider that wants to host $K$
VoD servers over the same shared link. A corollary question is then to
determine how many servers $K$ the fixed link capacity $C$ can support,
while guaranteeing a prescribed level of losses. If the servers are
independent, the probability for two of them to undergo a flash crowd
simultaneously is negligible. For ease and without loss of generality,
we moreover suppose that they are identically distributed and modeled by
the same LD spectrum $f^{(k)}(\alpha) = f(\alpha)$ with the same nominal
workload $\alpha^{(k)}_{a.s.} = \alpha_{a.s.}$, $k = 1, \dots, K$. Then,
following the same reasoning as in the previous case of a single server,
the maximum number $K$ of servers reads:

$$ K = \arg\max_{K}\ (C - K \cdot \alpha_{a.s.}) \geq C_0, \qquad (13) $$

where the safety margin $C_0$ is defined as in expression (12); that is,
$K$ is the largest number of servers for which the capacity left above
the aggregate nominal workload still covers the safety margin.

Then, depending on the agreed Service Level Agreements, the
infrastructure provider can easily offer different levels of loss
probability (QoS) to its VoD clients, and adapt the number of hosted
servers accordingly.

Fig. 7 Dimensioning $K$, the number of hosted servers sharing a fixed
capacity link $C$. The safety margin $C_0$ is determined according to
the probabilistic loss rate negotiated in the Service Level Agreement
between the infrastructure provider and the VoD service provider
(x-axis: time).
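The dimensioning steps of this section (solving (12) for the capacity by bisection, then counting the servers a link can host) can be sketched as follows. Everything specific in this block is an assumption made for illustration: the quadratic spectrum is a hypothetical stand-in for an estimated $f(\alpha)$, the integral is truncated at `alpha_max` and evaluated with the midpoint rule (which avoids the endpoint $\alpha = C$), the function names are invented, and relation (13) is read as "the largest $K$ such that $C - K \cdot \alpha_{a.s.}$ is at least $C_0$".

```python
import math

def loss_probability(f, capacity, buffer_size, alpha_max, steps=4000):
    """Left-hand side of (12), midpoint rule on [capacity, alpha_max]."""
    h = (alpha_max - capacity) / steps
    total = 0.0
    for n in range(steps):
        a = capacity + (n + 0.5) * h
        # Integrand: (-1 / f(a)) * exp(Q * f(a) / (a - C)), with f(a) < 0.
        total += -math.exp(buffer_size * f(a) / (a - capacity)) / f(a)
    return total * h

def minimum_capacity(f, alpha_as, buffer_size, p_loss, alpha_max, tol=1e-6):
    """Bisection for the smallest capacity whose loss estimate is below p_loss."""
    lo, hi = alpha_as + tol, alpha_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loss_probability(f, mid, buffer_size, alpha_max) < p_loss:
            hi = mid   # target met: try a smaller capacity
        else:
            lo = mid   # too many losses: more capacity is needed
    return hi

def max_servers(link_capacity, alpha_as, margin):
    """Largest K with link_capacity - K * alpha_as >= margin (our reading of (13))."""
    return max(0, math.floor((link_capacity - margin) / alpha_as))

def toy_spectrum(alpha):
    """Hypothetical spectrum: zero at the nominal load 3.0, negative elsewhere."""
    return -((alpha - 3.0) ** 2) / 20.0

capacity = minimum_capacity(toy_spectrum, alpha_as=3.0, buffer_size=10.0,
                            p_loss=1e-3, alpha_max=40.0)
margin = capacity - 3.0
print(round(capacity, 3), max_servers(100.0, 3.0, margin))
```

Bisection relies on the loss estimate decreasing with the capacity, as stated in the text; with a spectrum estimated from data it may be worth checking this monotonicity on a coarse grid before trusting the root.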



7. Conclusion 

The objective of this work is to harness probabilistic methods for
resource provisioning in the Clouds. We illustrate our purpose with a
Video on Demand scenario, a characteristic service whose demand relies
on information spreading. Adopting a constructive approach to capture
the users' behavior, we proposed a simple, concise and versatile model
for generating the workload variations in such a context. A key point of
this model is that it reproduces the workload time series with a
Markovian process, which is known to satisfy a Large Deviation Principle
(LDP). This particularly interesting property yields a large deviation
spectrum whose interpretation enriches the information conveyed by the
standard steady-state distribution: for a given observation (workload
trace), the LDP allows us to infer (theoretically and empirically) the
probability that the time-averaged workload, calculated at an arbitrary
aggregation scale, deviates from its nominal value (i.e. the almost sure
value).
We leveraged this multiresolution probabilistic description to
conceptualize two different management schemes for dynamic resource
provisioning. As explained, the rationale is to use large deviation
information to help network and service providers agree on the best
capex-opex trade-off. Two major stakes of this negotiation are: (i) to
determine the largest reconfiguration time scale adapted to the workload
elasticity and (ii) to dimension the VoD server so as to guarantee, with
the utmost probability, the Quality of Service imposed by the negotiated
Service Level Agreement.
More generally though, the same LDP-based concepts can benefit any other
"Service on Demand" scenario deployed on dynamic cloud environments.